A novel diagnostic model for tuberculous meningitis using Bayesian latent class analysis

Background Diagnosis of tuberculous meningitis (TBM) is hampered by the lack of a gold standard. Current microbiological tests lack sensitivity and clinical diagnostic approaches are subjective. We therefore built a diagnostic model that can be used before microbiological test results are known. Methods We included 659 individuals aged \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ge 16$$\end{document}≥16 years with suspected brain infections from a prospective observational study conducted in Vietnam. We fitted a logistic regression diagnostic model for TBM status, with unknown values estimated via a latent class model on three mycobacterial tests: Ziehl–Neelsen smear, Mycobacterial culture, and GeneXpert. We additionally re-evaluated mycobacterial test performance, estimated individual mycobacillary burden, and quantified the reduction in TBM risk after confirmatory tests were negative. We also fitted a simplified model and developed a scoring table for early screening. All models were compared and validated internally. Results Participants with HIV, miliary TB, long symptom duration, and high cerebrospinal fluid (CSF) lymphocyte count were more likely to have TBM. HIV and higher CSF protein were associated with higher mycobacillary burden. In the simplified model, HIV infection, clinical symptoms with long duration, and clinical or radiological evidence of extra-neural TB were associated with TBM At the cutpoints based on Youden’s Index, the sensitivity and specificity in diagnosing TBM for our full and simplified models were 86.0% and 79.0%, and 88.0% and 75.0% respectively. Conclusion Our diagnostic model shows reliable performance and can be developed as a decision assistant for clinicians to detect patients at high risk of TBM. Summary Diagnosis of tuberculous meningitis is hampered by the lack of gold standard. We developed a diagnostic model using latent class analysis, combining confirmatory test results and risk factors. Models were accurate, well-calibrated, and can support both clinical practice and research. Supplementary Information The online version contains supplementary material available at 10.1186/s12879-024-08992-z.


Background
Tuberculous meningitis (TBM) is the most lethal form of Mycobacterium tuberculosis (Mtb) infection [1].Early diagnosis is critical to initiate appropriate therapy, especially for low-middle-income countries, where access to diagnostic tests is limited.Nevertheless, diagnosing TBM is notoriously challenging as no gold standard exists.A confirmatory diagnosis of TBM requires identification of Mtb in cerebrospinal fluid (CSF), but conventional tests -Ziehl-Neelsen smear (ZN-Smear) microscopy, Gen-eXpert MTB/RIF (Xpert), and culture by Mycobacteria Growth Indicator Tube (MGIT) -lack sensitivity due to the challenge of Mtb identification in paucibacillary CSF [1,2].
A diagnosis of TBM is often guided by a combination of clinical, biochemical, and imaging features.TBM is suggested by a long duration of symptoms at presentation and abnormal CSF biomarker values.A number of diagnostic algorithms have been created that utilise this information [3][4][5][6].However, many of these have not been widely validated, especially for those individuals for whom the true disease status could not be verified.
In 2010, a uniform case definition (UCD) was developed based on expert opinion [5].This illustrated the diagnostic challenges presented by TBM and was intended to facilitate comparison of research data, rather than for use in clinical practice.The UCD distinguishes between four diagnostic levels: definite, probable, possible, and not TBM.Individuals with microbiological or molecular confirmation of Mtb within the CSF are allocated to definite TBM.The remaining cases are defined as probable or possible, encompassing an expert-generated grading system where higher scores designate an increased probability of a diagnosis of TBM.It is expected that true TBM cases are represented by all definite cases, most probable cases, and some possible cases.Since TBM is almost always fatal if not treated with antituberculosis drugs and delayed treatment is strongly associated with death or severe neurological sequelae in survivors, physicians usually err on the side of TBM treatment for those with compatible symptoms despite ongoing diagnostic uncertainty.
In social science and psychology, a technique named Latent class analysis (LCA) or Latent class model (LCM) has been employed that has the ability to infer "hidden groups" (called latent classes) based on "observed indicators".The individual allocation into each group can be improved by relating it to "additional features".LCA was later adopted in diagnostic studies to answer a similar problem when there is no practical gold standard to determine the disease status, as in the case of pulmonary tuberculosis and TBM [7][8][9][10].
The rationale and formulation of our LCA are discussed in the clinical and statistical supplementary appendices.In our study, we defined two "hidden groups", those infected with Mtb and those infected with another pathogen, and the "observed indicators" are the three confirmatory tests (ZN-Smear, MGIT, and Xpert).The "additional features", herein referred to as "diagnostic features" in our study, are the clinical, biochemical and imaging diagnostic features as used in the uniform case definition in combination with a few characteristics that are strongly indicative of not having TBM.A full list of included diagnostic features are outlined in the Clinical supplementary appendix.The main purpose of our study is to use the confirmatory test results and the diagnostic features to estimate TBM status per individual and develop a calibrated scoring system that quantifies the risk of TBM based on the diagnostic features.Our model allowed us to re-evaluate the performance of each confirmatory test.To improve clinical usability, we also developed a simplified scoring system requiring only clinical information and chest X-ray, without CSF analysis.

Participants
We used data from a prospective observational study on individuals with suspected brain infections who were admitted to the neuro-infection ward of the Hospital for Tropical Diseases (HTD), Ho Chi Minh City, Vietnam [11].The HTD is a 550-bed centre providing secondary and tertiary treatment for a wide range of tropical infections in Southern Vietnam [3].The study received ethical approval from HTD and the Oxford Tropical Research Ethics Committee and written informed consent was obtained from all participants or their relatives if they were incapacitated [11].
Participants were ≥ 16 years old and were enrolled between 29th August 2017 and 22nd January 2021.All were suspected of neurological infection and underwent lumbar puncture at baseline as a routine diagnostic procedure.Patients were ineligible for enrolment if performing a lumbar puncture was contraindicated and excluded from our analysis if the mycobacterial cultures of CSF taken within the first week were contaminated.

Clinical and imaging data
Demographics (age, gender) and relevant medical history were collected.HIV tests were only conducted on those with identified risk factors for HIV infection.All participants underwent a chest X-ray.Brain imaging was not performed routinely and not included in the database.

Cerebrospinal fluid (CSF) analysis
White cell count (WCC) and cellular differential, protein, glucose (with paired blood glucose taken at the same time), and lactate were measured at enrolment.A Gram stain was performed to screen for bacterial meningitis, PCR/serology for viral meningitis, PCR for Angiostrongylus cantonensis, and India-ink staining and a cryptococcal antigen lateral flow test if cryptococcal meningitis was suspected.Where possible at least 6 mL CSF was used for mycobacterial testing by ZN-Smear, MGIT, and either Xpert MTB/RIF or Xpert MTB/RIF Ultra (Xpert Ultra).We combined Xpert Ultra with Xpert as they were diagnostically comparable for TBM in our setting [11].In case an insufficient sample (< 5 mL) of CSF was collected, the microbiological tests were done according to case-bycase clinical judgement.In brief, if there was no clinical suggestion of TBM, the priority might be shifted to finding other diagnoses.If CSF was repeatedly sampled (as directed by clinical need), only the first sample was used as long as it had at least 3 mL of CSF and was not collected later than 7 days since enrolment.Methods of CSF processing have been described elsewhere [11].

Diagnosis and treatment
All patients received treatment according to the national and local guidelines.At the time of discharge or death, all were given a final diagnosis, based on the available clinical and laboratory information, including treatment response.If at least one of ZN-Smear, Xpert, or MGIT from CSF was positive at any time, the patient had definite TBM.Patients had suspected TBM if confirmatory microbiological and molecular tests were negative, but TBM was clinically suggested and anti-TB drugs were started.Those who recovered without anti-TB chemotherapy or had an alternative diagnosis confirmed microbiologically (e.g., by culture, PCR, or antigen tests) were assigned another diagnosis (i.e., not TBM).

Statistical analysis
Our latent class model for TBM has two components, both consisting of logistic regression models (Fig. 1).In the indicator model, we made the results of three observed confirmatory tests ZN-Smear, MGIT, and Xpert depending on TBM status.The prevalence model quantifies the probability to have TBM.This probability is purely based on the diagnostic features, prior to any confirmatory tests.As the three tests share similar mechanisms of detecting the presence of Mtb, we added an individual random effect -which we call the mycobacillary burden -to eliminate their collinearity [7,12].The latent mycobacillary burden was related to a set of modulating factors (bacillary burden sub-model).
The choice of diagnostic features and modulating factors was based on prior knowledge of their association with TBM status and bacterial burden [5] (Clinical supplementary appendix).We added three indicators of an alternative diagnosis to the prevalence model: (1) positive CSF eosinophil count-a strong biomarker for eosinophilic meningitis, a relatively common condition in Vietnam, usually caused by Angiostrongylus cantonensis; (2) positive CSF Indian-ink staining or cryptococcal antigen lateral flow test; and (3) positive CSF Gram stain for nonacid-fast bacterial meningitis.We also included CSF red cell count as it is a marker of a traumatic lumbar puncture, which requires corrections to WCC and biochemical features [13][14][15].A complete formulation of our model design is reported in the Statistical supplementary appendix.
We used the individual estimated TBM risk from the above model to fit a simplified prevalence model that excluded laboratory information.In their stead, we added some simple clinical symptoms involved in a TBM diagnosis: fever, headache, neck stiffness, and psychosis.These variables were easy to measure and less prone to missingness.This allowed us to create a simple risk score table for quick screening.Furthermore, we used Bayes' rule to calculate the change in probability of having TBM after retrieving results of some or all confirmatory tests, relative to the pre-test TBM probability (Clinical supplementary appendix).Because any positive test means definite TBM, and because MGIT culture results take 2-8 weeks, we limited to five scenarios where negative confirmatory test results become available: Missing values were handled based on their expected missingness mechanism (Clinical supplementary appendix).We later tested their validity by conducting appropriate sensitivity analyses in the Statistical supplementary appendix.
We chose a Bayesian approach for estimation and incorporated prior knowledge [1,16,17] on test sensitivity and specificity.on test sensitivity and specificity.Weakly informative prior distributions were used for all model coefficients (Statistical supplementary appendix).Estimates of the posterior distributions were obtained via Hamiltonian Markov Chain Monte Carlo using R version 4.2 [18] and Stan 2.27 [19].Convergence was evaluated by the Brooks-Gelman-Rubin R statistic.All results are presented as medians and equal-tailed 95% credible intervals (CrI) unless specified otherwise.
We compared different models and selected the best one using the expected log point-wise predictive density (elpd) [20].To give an informal insight, we validated our full and simplified prevalence model with the final hospital diagnosis as a pseudo gold standard-in which a patient is considered as having TBM if they were diagnosed with suspected or definite TBM.A proper validation of the selected model was performed on the observed confirmatory test results and is explained in the Statistical supplementary appendix, in which we also compared our estimated TBM probability with standard UCD [5].All performance metrics were calculated using repeated cross-validation procedures.

Clinical characteristics
Of the 692 participants, 659 were included in the analysis (Fig. 2).All participants had at least one symptom suggestive of a brain infection (headache, fever, neck stiffness, vomiting, convulsion).Characteristics of included individuals are summarised in Table 1, stratified by overall and specific confirmatory test results.The median age of included individuals was 40 years.HIV co-infection status was assessed for 448/659 (68%) individuals, amongst whom 50 (11%) were positive.For each of the confirmatory tests, both HIV prevalence and duration from symptom onset to hospitalisation were higher in the test-positive group than in the negative and missing groups.There were 355 participants with valid TBM confirmatory test results, of whom 138 were microbiologically confirmed TBM (Fig. 3), while 118 were later confirmed with another disease.Amongst the 304 patients who were not tested microbiologically for TBM, 220 were confirmed with another pathogen.For 145 patients, from both the tested and the untested groups, no confirmed diagnosis could be made.

Model estimates
Estimated coefficients for TBM diagnostic features in the prevalence model are shown in Fig. 4 and the Clinical supplementary appendix.HIV infection, miliary TB, and longer symptom duration were associated with a higher risk of TBM (median OR = 9.9, 166.2, and 1.9, respectively).Of the laboratory variables, there was a strong increase in TBM risk with higher lymphocyte count (OR = 6.4,95% CrI 1.2 -69.1) and weaker increase with lower paired blood glucose (OR = 0.3, 0.1 -1.2), lower CSF glucose (OR = 0.9, 0.5 -1.5), higher CSF protein (OR = 1.2, 0.7 -2.3), and higher lactate (OR = 2.3, 0.8 -6.6).The relationship between CSF WCC and TBM risk was nonlinear, peaking at 238 cells per mm 3 of CSF.HIV infection and higher CSF protein were associated with higher mycobacillary burden, while higher Glasgow coma score (GCS), CSF glucose, WCC and lymphocyte count all associated with lower mycobacillary burden.
Estimates of sensitivity and specificity of ZN-Smear, MGIT, and Xpert in diagnosing TBM are reported in Table 2.The specificity of each is > 99%.ZN-Smear was the most sensitive test with sensitivity of 62.5% (47.8%-80.9%)for people without HIV and 89.0%(81.5%-94.6%)for those with HIV.Xpert was the least sensitive, especially in the former group, at 33.8% (24.9%-45.7%).The sensitivity of Xpert was much higher in the group with HIV, at 75.5% (64.7%-84.3%)and comparable to MGIT sensitivity.

Model pseudo-performance on diagnosing TBM
Our selected indicator model showed good discrimination with the Areas under the Receiver-operating-characteristic curve (AUCs) of 92.6% for ZN Smear, 93.0% for MGIT, and 96.1% for Xpert (Statistical supplementary appendix).Assuming that the final hospital diagnosis was perfect, our selected prevalence model showed good discrimination with AUC = 93.9%(Fig. 5A).At the cutpoint maximising Youden's index of 36%, the model was 86% sensitive and 88% specific.Our prevalence model was also well-calibrated, with calibration intercept = 0.051 and slope = 1 [21].The calibration curve also showed a good correspondence between the estimated probability and the observed diagnosis of TBM (Fig. 5B).All fold-wise ROC and calibration curves showed consistent behaviour.

Updated TBM risk if some or all confirmatory tests are negative
The changes in TBM risk relative to the pre-test values are shown in Table 3.A negative ZN-Smear alone halved the probability of TBM regardless of HIV status.
A negative ZN-Smear and Xpert were associated with a 3.6-fold reduction in TBM risk for participants with HIV and a 2.4-fold reduction for those without HIV.With all tests negative, the TBM probability was reduced by 2.7 and 4.7, respectively for the two groups above.

Simplified model
Most included diagnostic features in the simplified prevalence model associated with TBM status (Fig. 6A).The association was not clear with focal neurological deficit.Three of the four additional symptoms suggested a TBM diagnosis, only psychosis made another diagnosis more likely.
Our simplified model had good AUC.The cutpoint maximising Youden's index is at a probability of having TBM of 30.7%,where our model was 79.0% sensitive and 75.0%specific (Fig. 6B).We used this cutpoint to derive a screening score table by selecting rounded parameter values within the 50% credible intervals of every coefficient.All the scores and the threshold were subsequently multiplied by a factor of 2 to make the calculation easier (Table 4).
The model was also calibrated, with calibration slope = 0.94 and calibration intercept = -0.19.All the averaged and individual curves showed consistent behaviour (Fig. 6C), in which the estimated TBM probability corresponded well to the observed TBM diagnosis made at discharge or death.

Discussions
We used a latent class approach to obtain a diagnostic model of TBM for adults with suspected brain infections.
Upon validation, our model showed good calibration and discrimination.Using cutpoints based on Youden's Index, both our full and simplified prevalence models surpassed the most sensitive confirmatory test ZN-Smear (86% and 79%, compared with 65% of ZN-Smear), while still obtaining good specificity (88% and 75%, respectively).The association direction of the diagnostic features and the individual TBM risk mostly followed prior knowledge.There are two differences compared with the UCD [5].Firstly, we enforced a positive coefficient on HIV infection.Secondly, we found that a higher GCS was associated with a higher probability of TBM.Still, our estimated TBM risk was well-calibrated with the final hospital diagnosis made at discharge or death.This raises confidence that both are accurate.However, our model provides results far earlier in patients' hospitalisation.
A novel quantification in our model is the latent mycobacterial burden.This estimate showed good correspondence with another laboratory-based quantification (Clinical supplementary appendix).We found that higher lymphocyte count in CSF was associated with increased probability of having TBM but with reduced mycobacillary burden.As lower mycobacillary burden is believed to be associated with reduced mortality, this was in line with a previous study, in which higher lymphocyte count was linked to increased survival from TBM [22].
As a by-product, we could re-evaluate the performance of the current microbiological assays.ZN-Smear had the highest sensitivity, 64.6% (95% CrI 50.9%-81.8%),confirming prior studies from our centre and reflecting the high level of technical expertise in conducting the test [1,11].In our laboratory, Xpert performed poorly, especially for individuals without HIV, who on average had low mycobacillary burden.All tests performed much better for those with HIV, who tended to have higher burden.Within this group, ZN-Smear can be used as a reliable diagnostic standard, although the performance and utility are likely to be reduced outside of expert laboratories.
We are not the first to use LCA to help improve TBM diagnosis.A previous study from the Vietnam National Lung Hospital in Hanoi [6] had a different target population and estimated TBM prevalence amongst individuals with TB of any type.They made the strong assumption that all confirmatory test results were mutually independent.We could relax that assumption by including a model for mycobacterial burden.Unlike previous studies [4,6,8], we used non-confirmatory biomarkers as predictors, not as manifest variables.This implementation had two purposes: on the technical side, it lowered the risk of violating the aforementioned assumptions of independence if more manifest variables were included; and on the Our study has some limitations.There were missing data, especially in test results and HIV status.When imputing missing values, we had to make several assumptions.These assumptions, despite the validity checks of the imputation (Statistical supplementary appendix), could have biased the results to some extent.In addition, this is a single-centre study conducted in a specialised brain infections centre.The prevalence of individuals with severe disease may be lower in other centres.Also, although the clinicians' s assessment and diagnosis skills  were supposedly high and followed a standardised guideline, we could not rule out the risk of biases induced by personal judgement, especially in detecting clinical symptoms.The levels of clinical and laboratory expertise at our tertiary hospital may also be higher; especially the performance of CSF ZN-smear may not generalise well to other laboratories.There is less inter-laboratory variability in the performance of CSF Xpert and culture, but this does not detract from the need for external validation of our findings.In both cases, it highlights the benefit of a diagnostic tool that is less sensitive to expert judgement -which our prevalence models provide.
In conclusion, despite many past attempts to quantify microbiological test performance and develop diagnostic methods for TBM, this is amongst the first studies to utilise LCM and rigorously validate many assumptions  To simplify to calculation, the estimates were doubled before choosing the scores.The decision threshold was decided by the cutpoint in the total score that maximises Youden's Index Continuous features were cut into bins so that each bin accounts for 1 incremental points

Fig. 1
Fig. 1 Basic model design.Unknown TBM status is linked with test results.The probability of a positive test depends on bacillary burden, which in turn depends on modulating factors.TBM risk factors help determining an individual's TBM status.The distribution of test results shown on the bar plots are for demonstration only and do not correspond to the actual numbers (a) only Smear, (b) only Xpert, (c) Smear and MGIT, (d) Smear and Xpert, and (e) all three tests.

0
In the "Any test" stratum: positive where at least one amongst ZN-Smear, MGIT, or Xpert is positive.ZN-Smear, MGIT, and Xpert are the sub-populations where the respective microbacterial test was performed (N (cell(s)/mm3

Fig. 3 Fig. 4
Fig. 3 Venn diagram for ZN-Smear, MGIT, and Xpert profile in the study population

Fig. 6
Fig. 6 Posterior estimates and performance of the simplified prevalence model, assuming the final hospital diagnosis is the true status.A Posterior estimates of coefficients of clinical TBM risk factors.Points, thick and thin lines are medians, 50% and 95% credible intervals.B ROC plot and AUC.AUC values are presented as "average (min-max over 20 repetitions of cross-validation)".C Calibration plot, showing the relationship between the predicted probability and observe outcome, smoothed by a loess curve.The grey lines are fitted curves from each 20-fold cross validation and coloured lines represents their average.The cross-validation procedure is explained in the Statistical supplementary appendix

Table 2
Posterior estimates of test specificities and sensitivities to diagnose TBM, for overall population, and stratified by HIV infection.Test specificities are the same for all strata Calibration plot, showing the relationship between the predicted probability and observe outcome, smoothed by a loess curve.The grey lines are fitted curves from each 20-fold cross validation and coloured lines represents their average.The cross-validation procedure is explained in the Statistical supplementary appendix Fig.5of the selected prevalence model, assuming the final hospital diagnosis is the true status.A ROC curve and AUC: AUC values are presented as "average (min-max over 5 repetitions of cross-validation)"; B

Table 3
Change in probability of TBM after one or more negative confirmatory test results, relative to TBM probability when no test results are known Values are in the form Median (95% Credible interval) Assuming pre-test risk is 1, the values measure post-test risk in each scenarioIn each scenario, "-" represents a negative test result and "?" represents an unknown (or unretrieved) test result

Table 4
TBM risk score table for screening, derived from the simplified prevalence model TBM suspected: Total score > = 14